Spatial Data Types
Spatial Data Types#
Dataset used is restaurants.csv from PostGIS in Action textbook.
from sqlalchemy import create_engine
import pandas as pd
# experimenting with holoviews
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')
engine_str = "postgresql+psycopg2://docker:docker@0.0.0.0:25432/restaurants"
engine = create_engine(engine_str)
%load_ext sql
%sql $engine.url
'Connected: docker@restaurants'
Import the restaurants dataset csv to postgres
%%sql
DROP TABLE IF EXISTS restaurants CASCADE;
CREATE TABLE restaurants (
id SERIAL,
name VARCHAR(3),
latitude REAL,
longitude REAL,
PRIMARY KEY (id)
);
* postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
Done.
Done.
[]
CSV is mounted in in postgres docker container.
%%sql
COPY restaurants(name, latitude, longitude)
FROM '/opt/data/restaurants.csv'
DELIMITER ',';
* postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
50002 rows affected.
[]
PostGIS extensions are available in the docker image I am using. We need to enable the base PostGIS extension for this database.
%%sql
SELECT *
FROM pg_available_extensions
WHERE comment SIMILAR TO '%%PostGIS%%';
* postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
11 rows affected.
| name | default_version | installed_version | comment |
|---|---|---|---|
| postgis_topology-3 | 3.2.1 | None | PostGIS topology spatial types and functions |
| postgis_sfcgal | 3.2.1 | None | PostGIS SFCGAL functions |
| postgis_tiger_geocoder-3 | 3.2.1 | None | PostGIS tiger geocoder and reverse geocoder |
| postgis_tiger_geocoder | 3.2.1 | None | PostGIS tiger geocoder and reverse geocoder |
| postgis-3 | 3.2.1 | None | PostGIS geometry and geography spatial types and functions |
| postgis_sfcgal-3 | 3.2.1 | None | PostGIS SFCGAL functions |
| postgis_topology | 3.2.1 | None | PostGIS topology spatial types and functions |
| postgis_raster | 3.2.1 | None | PostGIS raster types and functions |
| postgis_raster-3 | 3.2.1 | None | PostGIS raster types and functions |
| postgis | 3.2.1 | 3.2.1 | PostGIS geometry and geography spatial types and functions |
| pointcloud_postgis | 1.2.1 | None | integration for pointcloud LIDAR data and PostGIS geometry data |
%sql CREATE EXTENSION IF NOT EXISTS postgis;
* postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
Done.
[]
Using PostGIS geometry and geography types to store the coordinate information. The geography type has special methods available specifically for spherical coordinate systems.
%%sql
ALTER TABLE restaurants
ADD COLUMN IF NOT EXISTS geom geometry,
ADD COLUMN IF NOT EXISTS geog geography;
UPDATE restaurants
SET geom = ST_Point(longitude, latitude, 4326);
UPDATE restaurants
SET geog = geom::geography;
* postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
Done.
50002 rows affected.
50002 rows affected.
[]
Lets check out our data and confirm the SRID is correct.
%sql SELECT *, ST_SRID(geog) AS geog_srid, ST_SRID(geom) as geom_srid FROM restaurants LIMIT 10;
* postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
10 rows affected.
| id | name | latitude | longitude | geom | geog | geog_srid | geom_srid |
|---|---|---|---|---|---|---|---|
| 1 | BKG | 25.8092 | -80.24 | 0101000020E6100000000000205C0F54C0000000C027CF3940 | 0101000020E6100000000000205C0F54C0000000C027CF3940 | 4326 | 4326 |
| 2 | BKG | 25.8587 | -80.2094 | 0101000020E6100000000000C0660D54C0000000C0D3DB3940 | 0101000020E6100000000000C0660D54C0000000C0D3DB3940 | 4326 | 4326 |
| 3 | BKG | 25.8471 | -80.2415 | 0101000020E6100000000000C0740F54C000000080DBD83940 | 0101000020E6100000000000C0740F54C000000080DBD83940 | 4326 | 4326 |
| 4 | BKG | 25.8306 | -80.2661 | 0101000020E6100000000000C0071154C000000040A2D43940 | 0101000020E6100000000000C0071154C000000040A2D43940 | 4326 | 4326 |
| 5 | BKG | 25.9377 | -80.2449 | 0101000020E610000000000080AC0F54C0000000200DF03940 | 0101000020E610000000000080AC0F54C0000000200DF03940 | 4326 | 4326 |
| 6 | BKG | 29.187 | -82.1061 | 0101000020E610000000000060CA8654C000000040DF2F3D40 | 0101000020E610000000000060CA8654C000000040DF2F3D40 | 4326 | 4326 |
| 7 | BKG | 27.9256 | -82.7874 | 0101000020E6100000000000C064B254C000000020F4EC3B40 | 0101000020E6100000000000C064B254C000000020F4EC3B40 | 4326 | 4326 |
| 8 | BKG | 41.8426 | -87.7052 | 0101000020E61000000000000022ED55C000000060DAEB4440 | 0101000020E61000000000000022ED55C000000060DAEB4440 | 4326 | 4326 |
| 9 | BKG | 34.0096 | -85.9876 | 0101000020E6100000000000E0347F55C0000000A03A014140 | 0101000020E6100000000000E0347F55C0000000A03A014140 | 4326 | 4326 |
| 10 | BKG | 32.9516 | -117.062 | 0101000020E6100000000000C0F7435DC000000000CE794040 | 0101000020E6100000000000C0F7435DC000000000CE794040 | 4326 | 4326 |
Let’s visualize the points. I’d like to get familiar with the holoviews library, lets try it out.
with engine.connect() as con:
restaurants = pd.read_sql('SELECT ST_X(geom) as lon, ST_Y(geom) as lat, name FROM restaurants;', con)
vdims = [('name', 'Restaurant')]
ds = hv.Dataset(restaurants, ['lon', 'lat'], vdims)
points = hv.Points(ds)
points.opts(opts.Points(color='name', width=600, height=400, cmap='tab20', legend_position='top_left'))
How can I find the closest restaurants to Hunter College? As we are using the Geography type, we can find the minimum geodesic distance in meters.
%%sql
-- Hunter College coordinate retrieved from google maps
SELECT name, ST_Distance(ST_Point(-73.9644861858826, 40.768010416903955)::geography, geog) as distance_from_hunter
FROM restaurants
ORDER BY distance_from_hunter ASC
LIMIT 10
;
* postgresql+psycopg2://docker:***@0.0.0.0:25432/restaurants
10 rows affected.
| name | distance_from_hunter |
|---|---|
| KFC | 579.59174361 |
| MCD | 662.42972091 |
| MCD | 822.3617656 |
| WDY | 899.24830005 |
| MCD | 948.3807947 |
| MCD | 1038.67077814 |
| MCD | 1047.4709654 |
| MCD | 1322.93878124 |
| MCD | 1342.24631002 |
| TCB | 1418.26061805 |